ABSTRACT

This monograph is one of a continuing series 
initiated to provide materials for teachers, parents, school
administrators, and governmental decisionmakers that might encourage 
reexamination of a range of evaluation issues and perspectives about 
schools and schooling. This monograph is a description and analysis 
of two contrasting paradigms: one that now dominates the field of
evaluation research, practiced by the great majority of academic 
researchers in education and the social sciences; and another, the 
"alternative paradigm," which has been ignored. The two paradigms are 
outlined by a set of dichotomies to facilitate analysis and
discussion: (1) qualitative vs. quantitative methodology; (2)
validity vs. reliability; (3) subjectivity vs. objectivity; (4)
closeness to vs. distance from the data; (5) holistic vs. component
analysis; (6) process vs. outcome evaluation; (7) uniqueness vs. 
generalization; and (8) research for practitioners vs. research for 
scientists. Each of these contributes one chapter to the total of 12.
They are preceded by an introduction and followed by a chapter of 
conclusions and a bibliography. (DMT) 
In November 1972, educators from several parts of the United
States met at the University of North Dakota to discuss
some common concerns about the narrow accountability ethos 
that had begun to dominate schools and to share what many 
believed to be more sensible means of both documenting and
assessing children's learning. Subsequent meetings, much 
sharing of evaluation information, and financial and moral 
support from the Rockefeller Brothers Fund have all con- 
tributed to keeping together what is now called the North 
Dakota Study Group on Evaluation. A major goal of the 
Study Group, beyond support for individual participants 
and programs, is to provide materials for teachers, par- 
ents, school administrators and governmental decision- 
makers (within State Education Agencies and the U.S. Office 
of Education) that might encourage re-examination of a
range of evaluation issues and perspectives about schools 
and schooling. 

Towards this end, the Study Group has initiated a 
continuing series of monographs, of which this paper is 
one. Over time, the series will include material on, 
among other things, children's thinking, children's language,
teacher support systems, inservice training, and the
school's relationship to the larger community. The intent
is that these papers be taken not as final statements--a
new ideology, but as working papers, written by people
who are acting on, not just thinking about, these problems,
whose implications need an active and considered response.

Vito Perrone, Dean

Center for Teaching & Learning,

University of North Dakota 
Introduction



This is a description and analysis of alternative evaluation
research paradigms, or more specifically a description
and analysis of two contrasting paradigms: one that
now dominates the field of evaluation research, practiced
by the great majority of academic researchers in education
and the social sciences; and another, which in this paper
will be referred to as the alternative paradigm, that is
rather like an ignored, illegitimate stepchild lurking in
the shadow of the dominant paradigm. My purpose in this
paper is to examine the alternative assumptions, values,
ideology, and perceptions that inevitably undergird
evaluation research methodology. It is a task whose importance
is underscored by the recent explosion of interest in
evaluating educational innovations and social action
programs.

Part of this interest can be attributed to the budgetary
implications of evaluation results, part to the desire
of the active public for evaluative information about
government programs, part to the needs of program
administrators and participants for information about and
evaluation of their own programs. As Edward A. Suchman
has noted (1967): "The demand that some attempt be made
to determine the effectiveness of public service and the
social action programs has become increasingly insistent.
...The result has been a sudden awakening of interest in
a long-neglected aspect of social research...." Indeed,
since 1967, the literature in the field has mushroomed,
not only books (e.g., Suchman, 1967; Weiss, 1972a, b;
Caro, 1971; Rossi, 1972), but numerous articles in major
social science and education research journals. However,
what makes consideration of an alternative evaluation
research paradigm so pressing is the fact that these
prominent exemplars of evaluation are based on a single,
largely unquestioned, scientific paradigm.

The paradigm, which I refer to in the paper as The
Scientific Method, derives from and is based on the natural
science model. Over time it has emerged and been
legitimated as the only path to cumulative scientific
knowledge. While specialists in evaluation research may
vary in their emphasis on cost-benefit analysis, experimental
design, multiple regression analysis, the construction
of simulated mathematical models, systems analysis,
survey research, standardized measurement, and input-output
analysis, the underlying paradigm--The Scientific
Method--remains basically the same for all of these
techniques. Before pursuing the main concern of this paper,
it may be helpful to briefly outline that Method as
operationalized by some of its major advocates in evaluation
research and suggest some reasons for its dominance.

Michael Patton is a post-doctoral fellow in evaluation
research methodology and assistant professor of sociology
at the University of Minnesota.

The Dominant Paradigm 

A useful case in point is a study by Bernstein and Freeman
(1974), a mammoth evaluation of evaluation research,
sponsored and published by the Russell Sage Foundation.
Their focus included all evaluation studies directly funded
by agencies of the federal government in the fiscal
year of 1970. They sampled the population of all large-scale
social-action programs aimed at ameliorating some
social problem in the areas of health, education, welfare,
public safety (crime), income security, housing, and manpower
and carrying a minimum research budget of $10,000.
Their final analysis was based on 236 evaluation research
projects.

TABLE I. BERNSTEIN AND FREEMAN (1974) CODINGS OF
EVALUATION QUALITY VARIABLES

Variable Measuring Some         Coding Scheme
Aspect of Evaluation            (where higher coding number
                                represents higher quality)

1. Nature of Research Design    0 = Descriptive study
                                1 = Comparative, longitudinal or
                                    cross-sectional studies without
                                    randomization or control
                                2 = Experimental designs without
                                    both randomization and control
                                3 = Experimental designs with
                                    randomization and control

2. Representativeness of        0 = Haphazardly drawn samples
   the Sample                   1 = Moderately representative

3. Sampling                     0 = Non-systematic, non-random,
                                    non-systematic random, and
                                    random or non-random cluster
                                    samples
                                1 = Stratified random, simple
                                    random, or all (i.e. universe)

4. Type of Data Analysis        0 = No statistics, ratings, or
                                    impressions
                                1 = Narratives or impressionistic
                                    summaries
                                2 = Rating from qualitative data
                                3 = Simple descriptive statistics
                                4 = Multivariate statistics

5. Nature of Data Analysis*     0 = Qualitative analyses
                                1 = Evenly divided between
                                    qualitative and quantitative
                                    analyses
                                2 = Quantitative analyses

6. Quality of Measurement       0 = Inadequate measurement
   Procedures**                 1 = Adequate measurement

*Explanatory Quote from Bernstein and Freeman:

While there may be some debate as to the order we
have imposed here, i.e. quantitative as higher than half
quantitative and half qualitative, we feel justified in so
doing since most of the current literature on evaluation
research methods, e.g., Suchman, E.A., Evaluative Research,
1967, Russell Sage, N.Y.; Caro F. (ed.), Readings in
Evaluation Research, 1971, Russell Sage, N.Y.; Rossi, P. and
Williams W., Evaluating Social Programs, 1972, Seminar
Press, N.Y.; and Sheldon, E.B. and Bernstein, I.N.,
"Methods of Evaluative Research", in Social Science Methods,
(ed.) Robert Smith, 1973, Free Press, N.Y., strongly suggests
that the best evaluations in terms of research quality
are those which are highly quantitative.
In reviewing the findings of the Bernstein-Freeman
study, which set out to assess the quality of evaluation
research projects, what is of immediate interest to us is
the way the study measured 'quality': the quality variables
they identified and measured represent a fully explicit
description of the dominant evaluation research
paradigm. Table I shows how they coded their six major
indicators of quality, with a higher code number representing
higher quality evaluation research. What emerges is
1) experimental designs with randomization and control
groups, 2) reliable and valid measurement instrumentation,
3) representative samples that are 4) randomly selected,
and 5) sophisticated statistical analysis of 6) completely
quantitative data.
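
One way to see how directly Table I encodes a paradigm is to write the
coding scheme down as a lookup and score a study against it. The sketch
below is only an illustration under stated assumptions: the variable
keys, the abbreviated labels, the `share_of_maximum` summary, and the
example study are all hypothetical and are not part of the
Bernstein-Freeman study. Sample representativeness is left out because
only two of its levels are legible in Table I as reproduced here.

```python
# Coding levels transcribed (and abbreviated) from Table I.
# Keys are hypothetical names chosen for this sketch.
CODING_SCHEME = {
    "research_design": {
        0: "descriptive study",
        1: "comparative study without randomization or control",
        2: "experimental design without both randomization and control",
        3: "experimental design with randomization and control",
    },
    "sampling": {
        0: "non-random or cluster samples",
        1: "stratified random, simple random, or universe",
    },
    "type_of_data_analysis": {
        0: "no statistics, ratings, or impressions",
        1: "narratives or impressionistic summaries",
        2: "rating from qualitative data",
        3: "simple descriptive statistics",
        4: "multivariate statistics",
    },
    "nature_of_data_analysis": {
        0: "qualitative analyses",
        1: "evenly divided between qualitative and quantitative",
        2: "quantitative analyses",
    },
    "measurement_quality": {
        0: "inadequate measurement",
        1: "adequate measurement",
    },
}


def describe(codes):
    """Return the Table I label for each coded variable of one study."""
    labels = {}
    for variable, code in codes.items():
        levels = CODING_SCHEME[variable]
        if code not in levels:
            raise ValueError(f"{variable}: code {code} is not in the scheme")
        labels[variable] = levels[code]
    return labels


def share_of_maximum(codes):
    """Assumed summary for illustration: coded points over the maximum."""
    describe(codes)  # validates every code before summing
    earned = sum(codes.values())
    possible = sum(max(CODING_SCHEME[variable]) for variable in codes)
    return earned / possible


# Hypothetical study: participant observation reported as narrative.
field_study = {
    "research_design": 0,
    "type_of_data_analysis": 1,
    "nature_of_data_analysis": 0,
}
print(describe(field_study))
print(share_of_maximum(field_study))
```

Under this scheme a descriptive, qualitative field study sits near the
bottom of every scale it is coded on, whatever its relevance or
usefulness to the people in the program.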

Some might want to add to the Bernstein-Freeman list
but few practicing social scientists would question the
accuracy or validity of the research paradigm they describe
as The Scientific Method. (Some may note with dismay the
absence of any measurement to indicate whether the
information collected was relevant to the programs evaluated,
whether the evaluation information was used by decision-makers
and program participants, whether the outcomes measured
were those held to be important by program funders,
administrators, and participants, or whether the evaluation
design and results were understandable to those for
whom the evaluation was conducted. The Bernstein-Freeman
paradigm, however, makes no pretense of addressing such
questions; in the dominant paradigm such questions are not
of central methodological interest, a point to which I
return later.) At a conference on evaluation and policy
research sponsored by the American Academy of Arts and
Sciences in 1969, Peter Rossi reported general consensus
about the most desired evaluation research methods. The
consensus was virtually identical to the model found most
desirable by Bernstein and Freeman. A cursory skimming of
major educational and social science research journals
yields a similar lack of disagreement. In their widely
used methodological primer, Campbell and Stanley (1966:3)
call this paradigm "the only available route to cumulative
progress."

**Explanatory footnote from Bernstein and Freeman:

The satisfaction of a measure having adequate content
validity as it appears in Kerlinger, Fred, Foundations
of Behavioral Research, 1964, N.Y.: Holt, Rinehart
and Winston, pp. 444-447. An example of response which
was coded adequate was: "The criteria by which the
effectiveness of an educational program aimed at increasing
cognitive abilities of mentally retarded children was the
use of standardized reading comprehension, vocabulary, and
arithmetic tests, all of which had been pretested for
reliability on other similar target populations. Five
repeated measures were taken over a two-year period."

What accounts for such certainty, or at any rate such
acceptance, of an intellectual construct, at a time when
natural scientists themselves are reexamining their most
fundamental propositions? The answer to that question may
not be so inaccessible. As Kuhn (1970:80) explains, "A
paradigm governs, in the first instance, not a subject
matter but rather a group of practitioners." Those
practitioners most committed to the dominant paradigm are
found in the universities where they not only employ The
Scientific Method in their own evaluation research, but
where they also nurture students in a commitment to that
same methodology (cf. Bernstein and Freeman, 1974).

There are other reasons for the dominance of the natural
science model, reasons that go somewhat beyond the
merits of the Method. William J. Filstead (1970:3-4) suggests
such reasons as "ego fulfillment; the achievement of
Scientific respectability; the quest for social status on
a par with that of natural scientists; and grantsmanship,
which, although it is not necessarily helpful in ascertaining
the validity of the data, does enhance both those
who collect data in the appropriate fashion and the
discipline that fosters adherence to those appropriate
methods of data collection."

While there can be some argument about the reasons
for the dominance of the natural science model in educational
social scientific research, the fact of the dominance
cannot be seriously doubted. The issue for us is
that the very dominance of The Scientific Method in evaluation
research appears to have cut off the great majority
of its practitioners from serious consideration of
any alternative research paradigm. The label 'research'
has come to mean the equivalent of employing The Scientific
Method--of working within the dominant paradigm.
The Alternative Paradigm
The discussion that follows is focused on broad epistemological
contrasts. The alternative (methodological) paradigm
that I shall discuss is drawn together from a number
of emerging directions--trends, ideas, approaches, methods,
and perspectives that are not always clearly articulated
by their adherents. It draws on work in qualitative
methodology, phenomenology, symbolic interactionism,
Gestalt psychology, ethnomethodology, and the general notion
or doctrine of Verstehen. Kenneth Strike (1972:28)
described this tradition as follows:

The basic dispute clustering around the notion of
Verstehen has typically sounded something like the
following: The advocate of some version of the
Verstehen doctrine will claim that human beings
can be understood in a manner that other objects
of study cannot. Men have purposes and emotions,
they make plans, construct cultures, and hold certain
values, and their behavior is influenced by
such values, plans, and purposes. In short, a human
being lives in a world which has "meaning" to
him, and, because his behavior has meaning, human
actions are intelligible in ways that the behavior
of nonhuman objects is not. The opponents of this
view, on the other hand, will maintain that human
behavior is to be explained in the same manner as
is the behavior of other objects of nature. There
are laws governing human behavior. An action is
explained when it can be subsumed under some such
law, and, of course, such laws are confirmed by
empirical evidence.

The alternative paradigm stresses understanding that
focuses on the meaning of human behavior, the context of
social interaction, an empathic understanding of subjective
(mental, not nonobjective) states, and the connection
between subjective states and behavior. Filstead explains
that the tradition of verstehen or understanding
"has had its greatest influence in formulating the position
that recognizes the importance of both an inner and
an outer perspective of human behavior....The inner
perspective places emphasis on man's ability to know himself
and, hence, to know and understand others through
'sympathetic introspection' and 'imaginative reconstruction' of
'definitions of the situation.'"

The alternative paradigm proposes an active, involved
role for the social scientist/evaluation researcher.
"Hence, insight may be regarded as the core of social
knowledge. It is arrived at by being on the inside of the
phenomena to be observed....It is participation in an activity
that generates interest, purpose, point of view, value,
meaning, and intelligibility, as well as bias" (Wirth,
1949:xxii). As Filstead (1970:4) says, "this in no way
suggests that the researcher lacks the ability to be
scientific while collecting the data. On the contrary, it
merely specifies that it is crucial for validity--and,
consequently, for reliability--to try to picture the empirical
social world as it actually exists to those under
investigation, rather than as the researcher imagines it to be."
More concretely, the alternative paradigm relies on field
techniques from an anthropological rather than natural
science tradition, techniques such as participant observation,
in-depth interviewing, detailed description, and
qualitative field notes.
Opposing Paradigms


I have now described the broad outlines of two contrasting 
evaluative research paradigms. It is the task of the re- 
mainder of this paper to sharpen these contrasts, to bring 
them into high relief, to make them appear as opposites. 
Such an analysis, based on non-existent ideal-types, will
clearly overstate the case. Tacit understandings about
flexible parameters will here appear as absolute rules of 
procedures. Areas of mutuality, common concern, and simi- 
larity of commitments will be largely ignored. 

The justification for such an approach can be found 
in the very nature of paradigms. A paradigm is a world 
view, a general perspective, a way of breaking down the 
complexity of the real world. As such, paradigms are deep- 
ly embedded in the socialization of adherents and practi- 
tioners telling them what is important, what is legiti- 
mate, what is reasonable. Paradigms are normative, they 
tell the practitioner what to do without the necessity of 
long existential or epistemological consideration. But it 
is this aspect of a paradigm that constitutes both its 
strength and its weakness--its strength in that it makes
action possible, its weakness in that the very reason for 
action is hidden in the unquestioned assumptions of the 
paradigm. It is to raise these assumptions to the level 
of consciousness among evaluation researchers that this 
analysis is undertaken. The difficulty of this task is 
clear from Kuhn's description of the power of paradigms: 

Scientists work from models acquired through education
and through subsequent exposure to the literature
often without quite knowing or needing to know
what characteristics have given these models the
status of community paradigms. And because they do
so, they need no full set of rules. The coherence
displayed by the research tradition in which they
participate may not imply even the existence of an
underlying body of rules and assumptions that additional
historical or philosophical investigation
might uncover. That scientists do not usually ask
or debate what makes a particular problem or solution
legitimate tempts us to suppose that, at least
intuitively, they know the answer. But it may only
indicate that neither the question nor the answers
are felt to be relevant to their research. Paradigms
may be prior to, more binding, and more complete
than any set of rules for research that can be
unequivocally abstracted from them. (Kuhn, 1970:46.)

It is because "paradigms may be prior to, more binding,
and more complete than any set of rules for research
that can be unequivocally abstracted from them" that the
analysis here will focus on dominant motifs, modalities
of thought and action, and illumination of tacit
understandings. The dichotomies constructed will be aimed at
capturing the underlying and fundamental elements in the
two paradigms which are the bases of their opposition and
competition.

At the outset I considered the possibility of attempting
to describe and contrast the two paradigms in a
neutral fashion. However, the very dominance of one
paradigm, the natural science model, and the subordination of
the second paradigm, the alternative paradigm, convinced
me that it is more important to attack this imbalance than
to maintain neutrality. My concern here is two-fold:
first, I am concerned that practitioners and adherents of
the dominant paradigm show little awareness of or
consciousness about even the existence of an alternative
paradigm; and secondly, I am concerned that practitioners
of the dominant paradigm seem to be insensitive to and
unaware of the degree to which their methodology is based
upon a relatively narrow philosophical/ideological/
epistemological view of the world. "It is important," Mills
wrote, "to get this point quite clear, for one would suppose
that philosophical tenets would not be central to the
shaping of an enterprise which is so emphatic in its claim
to be Science. It is important also because the practitioners
of the style do not usually seem aware that it is
a philosophy upon which they stand" (Mills, 1961:56).

It is in this context that I wish to approach the
following discussion of evaluative research paradigms.
The assets of the alternative paradigm need to be stressed
and the shortcomings of the dominant paradigm need to be
seriously examined for the majority of evaluation researchers
seem to be oblivious of the assets of the former,
and euphoric about the techniques of the latter.
Herbert Blumer (1969:47) put the issue this way: "This
opposition needs to be stressed in the hope of releasing
social scientists from unwitting captivity to a format of
inquiry that is taken for granted as the naturally proper
way in which to conduct scientific inquiry."

As a final introductory note I would add that there
is a tension in this analysis between the abstract and
the concrete. I have tried to overcome it by illustrating
the points of paradigm opposition with examples
from the literature on educational evaluation, drawing
particularly on evaluations that have to do with open
education and other alternatives to traditional schooling.
These examples help make a point that runs throughout
this analysis: If there can be clarity about the
need to adopt evaluation methods that suit the nature of
the program being evaluated, the contrasting natures of
traditional and open education programs suggest a need
for contrasting evaluation strategies and techniques. I
shall pursue this point throughout the analysis that
follows.
Qualitative vs. Quantitative Methodology



Kuhn (1970:184-5) in his discussion of science paradigms
notes that the values held by scientists function to help
them choose between incompatible ways of practicing their
discipline and that "the most deeply held values concern
predictions: they should be accurate; quantitative
predictions are preferable to qualitative ones...." Kuhn is
writing mainly about natural scientists, but it is clear
that the values of natural scientists concerning prediction
have been enthusiastically embraced by social scientists
and educational researchers. Not only are quantitative
predictions preferable to qualitative ones, but
qualitative analyses in general have little legitimacy
beyond certain limited exploratory situations.

The art and science of quantification constitutes
the very core of the dominant paradigm. To turn words into
numbers, historical trends into prediction equations,
and the behavior of people into probability tables and
standardized regression coefficients--these are the
greatest miracles in Science, and to the performers of
those miracles go the greatest of all Scientific rewards:
recognition and high status.

The methodological status hierarchy in Science is
clear: the harder the data, the more scientific the results
and the higher the status. (By "hardness of data"
is meant the degree to which you can assign numbers to
what you are studying and manipulate those numbers using
sophisticated statistical techniques.) Thus, economics
outranks sociology as Science. Within sociology the
demographers, empiricists, and quantitative methodologists
rank at the top of the methodological status hierarchy;
ethnomethodologists, participant observers, and qualitative
methodologists occupy the lower parts of the hierarchy,
meaning they have more difficulty getting their
work published, greater problems on the job market, less
agility at attaining tenure and promotion, and greater
difficulty obtaining research grants. The same methodological
status hierarchy rules other disciplines, including
Schools of Education.

The foregoing is not meant as an across-the-board
attack on the use of statistics in evaluation research.
The problem is the use of statistics to the virtual
exclusion of other types of data. In this regard, C.
Wright Mills (1961:50) observed that the dominance of
statistical methodology has led to a "methodological
inhibition" that he called "abstracted empiricism." The
problem with abstracted empiricism is that "it seizes
upon one juncture in the process of work and allows it
to dominate the mind."

The dominance of quantitative methodology has acted
to severely limit the kinds of questions that are asked
and the types of problems that are studied. While most
phenomena are not necessarily intrinsically impossible to
measure quantitatively, certain types of phenomena are
clearly easier to measure numerically than others. It is
easier, for example, to measure the number of words that
a child spells correctly than to measure that same child's
ability to use those words in a meaningful way. The vast
majority of educational researchers have clearly opted
for the first procedure. It is easier to count the number
of minutes a student spends reading books in class
than it is to measure what reading means to that child.
We have a large number of studies of the former, but we
know little about the latter.

Quantitative methodology assumes the possibility,
desirability, and even the necessity of applying some
underlying empirical standard to social phenomena. Thus,
an underlying standard of measurement can be applied to
measure the wavelength of blue light. But qualitative
methodology assumes that some phenomena are not amenable
to such mediation. While you can measure the wavelength of
blue light, can you capture in quantitative notation
what the color blue looks and feels like? The experience
of looking at blue light is a direct encounter between
phenomenon and observer; it is not easily amenable to
statistical measurement.

The point here is that different kinds of problems
require different types of research methodology. If all
we want to know is the number of words a child can spell
or the frequency of interaction between children of different
races in desegregated schools, then statistical
procedures are appropriate. However, if we want to understand
the relevance of the words to that child's particular
life or the meaning of inter-racial interactions
then some form of qualitative methodology (participant
observation, in-depth interviewing, systematic field
work) which allows the researcher to obtain firsthand
knowledge about the empirical social world in question
may well be more appropriate. Mills (1961:73-74) has
stated this approach quite succinctly:

If the problems upon which one is at work are readily
amenable to statistical procedures, one should always
try to use them....No one, however, need accept such
procedures, when generalized, as the only procedure
available. Certainly no one need accept this model
as a total canon. It is not the only empirical
manner.

It is a choice made according to the requirements
of our problems, not a 'necessity' that follows from
an epistemological dogma.
An extended example may help illustrate the importance
of seeking congruence between the phenomenon studied
and the research methodology employed for this study.
The example, a major study frequently quoted, concerns the
key issue of whether or not educational innovation makes a
difference in children's achievement. After examining
some four decades of educational research, John Stephens
(1967) concluded that educational innovation makes little
difference. "But," asks Edna Shapiro (1973:542), "can
such a judgment be made when the researcher has sampled
only an extremely narrow band of measurement within a
constant and equally restrictive situation?"

Shapiro asked this question after finding no differences
in achievement test scores between 1) children in
an enriched Follow Through (FT) program modeled along the
lines of open education and 2) children in comparison
schools not involved in Follow Through or other enrichment
programs. When the children's responses in the test
situation were compared, no differences of any consequence
were found; however, when observations were made of the
children in their classrooms, there were striking differences
between the Follow Through and comparison classes:

A satisfactory explanation of the outcomes of this
study raises general questions about assessing the
impact of educational programs. Other studies may
be more elaborately mounted, more carefully controlled,
more elegantly analyzed, but the basic issues
remain the same. In this study, when we observed
the children in their classrooms, there were
striking differences between the FT and comparison
classes; when we compared the children's responses
in the test situation, there were no differences
of any consequence. Conventional explanations would
make little of the classroom differences, stressing
the absence of difference in individual test response.
The conventional explanation for equivocal findings
(and they are not unique--the educational research
literature is replete with negative findings) is that
the programs being compared do not make a difference,
that the research design was inadequate, or that it
is naive to expect differences since program variations
do not make a noticeable difference. My contention
is that such explanations do not go far
enough, that it is important to try to explain negative
results, and that it is far more important to explain
the disparity between the negative test
results and the clear differences observed in classroom
behavior. (Shapiro, 1973:527.)

Based on systematic observations the Follow Through
classrooms "were characterized as lively, vibrant, with a
diversity of curricular projects and children's products,
and an atmosphere of friendly, cooperative endeavor. The
non-FT classrooms were characterized as relatively uneventful,
with a narrow range of curriculum, uniform activity,
a great deal of seat work, and less equipment;
teachers as well as children were quieter and more concerned
with maintaining or submitting to discipline."
(Shapiro, p. 529.) Observations also revealed that the
children behaved differently in these two types of
environments. But standardized achievement tests failed to
detect these differences. Shapiro (p. 532) suggests that
"there were factors operating against the demonstration
of differences," which call into question traditional ways
of gauging the impact and effectiveness of different kinds
of school experience. The testing methodology* in fact* 

«»ivwi the nature of the questions that were being asked 
a>:i vperdetc&'iiKed nonsignificant statistical results* 
Shapiro's analysis of how the quantitative methodological 
procedures determined the research results are so insight- 
ful and -so important that we quote her at length: x 

Studies of the effectiveness of different kinds of
educational programs share a common methodology:
children of comparable background and ability are
exposed to or participate in experiences which vary
in certain ways and are subsequently tested on aspects
of learning or performance presumed to demonstrate
the impact of the differences in their
experiences. . . .

In this study, too, the child's responses in the
test situation were considered critical. What children
do in the classroom -- the kinds of questions
they ask, the kinds of activities they engage in,
the kinds of stories, drawings, poems, structures
they produce, the kinds of relationships they develop
with other children and the teacher -- indicates
not only what they are capable of doing but
what they are allowed to do. Classroom data are
generally down-graded in attempts to study the
effects of educational programs because we cannot
know whether the comparison group, given the same
opportunities, would behave in similar ways. And
conversely, we do not know whether, if the opportunity
were removed, there would be any carry-over
to a new classroom situation, that is, whether
the effects have been internalized. Nor is
it easy to separate the contribution of and effect
upon individual children in the group. Following
the line of reasoning of an earlier study, I assumed
that the internalized effects of different
kinds of school experience could be observed and
inferred only from responses in test situations,
and that the observation of teaching and learning
in the classroom should be considered auxiliary
information, useful chiefly to document the differences
in the children's group learning experiences.

The rationale of the test, on the contrary, is
that each child is removed from the classroom and
treated "equivalently," and differences in response
are presumed to indicate differences in what has
been "taken in," made one's own, that survives the
shift to a different situation.

The findings of this study, with the marked disparity
between classroom responses and test responses,
have led me to reevaluate this rationale.
This requires reconsideration of the role of classroom
data, individual test situation data, and the
relation between them. If we minimize the importance
of the child's behavior in the classroom because
it is influenced by situational variables,
do we not have to apply the same logic to the
child's responses in the test situation, which is
influenced by situational variables?

The individual's responses in the test situation
have conventionally been considered the primary means
to truth about psychological functioning. Test behavior,
whether considered as a sign or sample of underlying
function, is treated as a pure measure. Yet
the test situation is a unique interpersonal context
in which what is permitted and encouraged, acceptable
and unacceptable, is carefully defined, explicitly
and implicitly. Responses to tests are therefore
made under very special circumstances. The variables
that influence the outcome are different from those
which operate in the classroom, but the notion that
the standard test or interview provides equal treatment
for all subjects is certainly open to question.

(Shapiro, pp. 532-534.) 


Shapiro elaborates and illustrates these points at
considerable length. Her conclusion goes to the heart of
the problem posed by the dominance of a single methodological
paradigm in evaluation research: "Research methodology
must be suited to the particular characteristics of
the situations under study. . . . An omnibus strategy will not
work" (p. 543, italics added).

Most social scientists do not deny the immense heuristic
value of qualitative data. What they do deny is that
qualitative methodology can be a legitimate source of either
data collection, systematic evaluation, or theory construction.
At best, social scientists are willing to recognize
that qualitative methodology may be useful at an exploratory
stage of research prefatory to quantitative research.
However, "to force all of the empirical world to fit a
scheme that has been devised for a given segment of that
world is philosophical doctrinizing and does not represent
the approach of a genuine empirical science" (Blumer,
1969:23).

There is indeed a viable alternative to the dominant
natural science model, an alternative that not only employs
different methods but also asks different questions. And,
as Kuhn has explained, one of the functions of scientific
paradigms is to provide criteria for choosing problems that
can be assumed to have solutions: "Change in the standards
governing permissible problems, concepts, and explanations
can transform a science" (p. 106). It is the failure of
the dominant natural science paradigm to answer important
questions like those raised by Shapiro that makes serious
consideration of the alternative paradigm so crucial for
evaluation research.



Reliability vs. Validity



Any consideration of paradigms in science must focus on
dominant motifs and patterns. Paradigms tell scientists
what to emphasize, what to look for, what questions to be
concerned with, and what standards to apply. Competing
paradigms raise questions of emphasis. It is the contention
of this paper that the dominant paradigm in scientific
research, with its quantitative emphasis, has been
preoccupied with reliability, while the alternative paradigm
emphasizes validity.

Reliability concerns the replicability and consistency
of scientific findings. One is particularly concerned
here with inter-rater, inter-item, interviewer,
observer, and instrument reliability. Validity, on the
other hand, concerns the meaning and meaningfulness of
the data collected and instrumentation employed. Does
the instrument measure what it purports to measure? Does
the data mean what we think it means?

Merton (1957:448), one of the most prominent godfathers
of sociology, argues that the cumulative nature
of science requires a high degree of consensus among
scientists and leads, therefore, to an inevitable enchantment
with problems of reliability. With the proposition
that scientific research has been preoccupied with questions
of reliability, I can agree; but I part company
with the proposition that such a preoccupation is necessary
and good.

Irwin Deutscher (1970:33) has stated the problem
with great cogency:

We have been absorbed in measuring the amount of
error which results from inconsistency among interviewers
or inconsistency among items on our instruments.
We concentrate on consistency without much
concern with what it is we are being consistent
about or whether we are consistently right or wrong.
As a consequence we may have been learning a great
deal about how to pursue an incorrect course with a
maximum of precision.

It is not my intent to disparage the importance
of reliability per se; it is the obsession with it
to which I refer. Certainly zero reliability must
result in zero validity. But the relationship is
not linear, since infinite perfection of reliability
(zero error) may also be associated with zero
validity. Whether or not one wishes to emulate
the scientist and whatever methods may be applied
to the quest for knowledge, we must make our estimates
of, allowances for, and attempts to reduce the
extent to which our methods distort our findings.

The problem with the standardized tests in Shapiro's
study of open education Follow Through classrooms was not
that they were unreliable, but that they were not valid
measures of the learning taking place in those classrooms.
Yet any suggestion that standardized tests may be an
inappropriate measure of learning is met with outraged
accusations that reliability of measurement is being sacrificed.
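
Deutscher's point that an instrument can be highly consistent
yet consistently wrong can be illustrated with a small
simulation. The sketch below is not drawn from Shapiro's or
Deutscher's data. It assumes two hypothetical instruments:
a "standardized" one that tracks an unrelated trait with very
little noise, and an "observational" one that tracks the
learning of interest with more noise. Reliability is taken,
as an assumption, to be the correlation between two
administrations, and validity the correlation with the
(simulated) learning itself. All names and noise levels are
invented for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=2000, seed=0):
    """Return {instrument: (reliability, validity)} for two made-up instruments."""
    rng = random.Random(seed)
    # The quality we actually want to evaluate (hypothetical).
    learning = [rng.gauss(0, 1) for _ in range(n)]
    # An unrelated trait, e.g. ease with the test format (hypothetical).
    test_ease = [rng.gauss(0, 1) for _ in range(n)]

    def administer(trait, noise_sd):
        # One administration: the underlying trait plus measurement noise.
        return [value + rng.gauss(0, noise_sd) for value in trait]

    results = {}
    for name, trait, noise_sd in [
        ("standardized", test_ease, 0.1),   # precise, but aimed at the wrong trait
        ("observational", learning, 0.7),   # noisier, but aimed at learning
    ]:
        first = administer(trait, noise_sd)
        second = administer(trait, noise_sd)
        reliability = pearson(first, second)   # agreement across administrations
        validity = pearson(first, learning)    # agreement with what we care about
        results[name] = (reliability, validity)
    return results


if __name__ == "__main__":
    for name, (reliability, validity) in simulate().items():
        print(f"{name:>13}: reliability={reliability:.2f}, validity={validity:.2f}")
```

Under these assumptions the first instrument repeats itself
almost perfectly while telling us next to nothing about
learning, and the second is less repeatable but far more
informative. The simulation shows only that high reliability
does not guarantee validity; it says nothing about any
particular real test.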

At the same time, validity has become a function of
frequency of use of some instrument. The often-used and
highly reliable instrument takes on a sanctity that places
it above question. After a while we lose sight of the actual
behaviors that are supposed to be associated with the
instrument. "The widespread misconceptions about the
so-called IQ provide a particularly flagrant example of such
a dissociation. One still hears the term 'IQ' used as
though it referred, not to a test score, but to a property
of the organism" (Anastasi, 1973:xi).

When one actually looks at the operational definitions
and measures of major educational and social scientific
concepts, one sees that their transparency and bias
are frequently astounding though their reliability is
extremely high. In addition, we seem to have lost sight of
the fact that responses mean different things in different
settings and different contexts. (The only way to discern
such variations in shades of meaning is to directly interact
with and observe respondents in various relevant settings.)
Thus, instruments prepared for evaluation in one
setting are adopted for evaluation in other settings with
a facility that shows arrogant insensitivity to the issue
of cross-setting validity. This does not mean that every
evaluation must include development of new instrumentation.
But every evaluation must include some effort to establish
the validity of the instrumentation adopted for the setting
in which it is used.

The alternative evaluation paradigm makes the issue
of validity central by getting close to the data, being
sensitive to qualitative distinctions, attempting to develop
empathy with program participants and thereby approaching
the data subjectively, and taking a holistic
and process perspective on evaluation (issues taken up
later in this paper). The overriding issue in the verstehen
approach to science is the meaning of the scientist's
observations and data, particularly its meaning for
participants themselves. The constant focus is on a valid
representation of what is happening, not at the expense
of reliable measurement, but without allowing reliability
to determine the nature of the data.

Discussion of varying emphasis on reliability and
validity in the two paradigms is particularly difficult
because the ideal in both paradigms is high reliability
and high validity. Nevertheless, differences in emphasis
in the two paradigms are clearly discernible. The
differences are a matter of emphasis and attention, but
it is of such differences that alternative paradigms are
made.



Objectivity vs. Subjectivity



Objectivity is considered the sine qua non of the Scientific
Method. Qualitative methodology and a phenomenological
approach to evaluation research, on the other
hand, most frequently stimulate charges of subjectivity --
a label held to be the very antithesis of scientific inquiry.
To be subjective means to be biased, unreliable,
and non-rational. Subjective data imply opinion rather
than fact, intuition rather than logic, impression rather
than confirmation. Social scientists are encouraged
to eschew subjectivity in favor of making their work
"objective and value-free."

Some evaluation researchers recognize that social
action research may take one so close to questions of politics
and values that it may be impossible to completely
eliminate subjectivity. Under these conditions "the task
for the development of evaluative research as a 'scientific'
process is to 'control' this intrinsic subjectivity,
since it cannot be eliminated . . . to examine the principles
and procedures that man has developed for controlling
subjectivity -- the scientific method. . . ." (Suchman, 1967).

Not surprisingly, the means for controlling subjectivity
through the scientific method are the techniques of
the dominant paradigm, particularly quantitative methodology
and emphasis on reliability. Yet we have already
argued that quantitative methodology works, in practice,
to limit and even bias the kinds of questions that can be
asked and the nature of admissible solutions. In effect,
identifying objectivity as the major virtue of the dominant
paradigm is an ideological statement the function of
which is to legitimize, preserve, and protect the dominance
of a single evaluation methodology.

Michael Scriven (1972:94) asserts that quantitative
methods are no more synonymous with objectivity than qualitative
methods are synonymous with subjectivity. "Errors
like this are too simple to be explicit. They are inferred
confusions in the ideological foundations of research,
its interpretations, its applications." Scriven goes on
to comment that "it is increasingly clear that the influence
of ideology on methodology and of the latter on the
training and behavior of researchers and on the identification
and disbursement of support is staggeringly powerful.
Ideology is to research what Marx suggested the
economic factor was to politics and what Freud took sex
to be for psychology."

Scriven's (1972) discussion of "Objectivity and Subjectivity
in Educational Research" is a major contribution
in the struggle to detach the notions of objectivity and
subjectivity from their traditionally narrow associations
with quantitative and qualitative methodology, respectively.
He presents a cogent argument for recognizing as
legitimate science not only the prediction of social phenomena
but also, and perhaps even more important, recognizing
as science the pursuit of understanding -- verstehen.
The quest for social prediction in the same sense as prediction
operates in the classical natural science paradigm
is "pipe dreaming" (p. 115). The practice of science
has led to a formalistic split between the mental and the
logical, seen as the subjective and the objective, which
keeps researchers from seeing that "understanding, properly
conceived, is in fact an 'objective' state of mind
or brain and can be tested quite objectively; and it is a
functional and crucial state of mind, betokening the presence
of skills and states that are necessary for survival
in the sea of information. There is nothing wrong with
saying, in this case, that we have simply developed an
enlightened form of intersubjectivism. But one might also
equally well say that we have developed an enlightened
form of subjectivism -- put flesh on the bones of empathy"
(p. 127).

Scriven is here suggesting two different ways of
looking at the same thing. The idea of dual perspectives
concerning a single phenomenon goes to the very heart of
the dichotomy between paradigms. Two scientists may look
at the same thing, but because of different theoretical
perspectives, different assumptions, or different
ideology-based methodologies, they may literally not see the
same thing (cf. Petrie, 1972:48). Indeed, Kuhn (1970:113)
argues that "something like a paradigm is prerequisite
to perception itself. What a man sees depends both
upon what he looks at and also upon what his previous
visual-conceptual experience has taught him to see. In
the absence of such training there can only be, in
William James' phrase, 'a bloomin' buzzin' confusion.'"

It is in this context that the dominant paradigm's
assertion of objectivity can be called ideology. Such
an analysis is based on the relativistic assumption that
it is not possible for us to view the complexities of the
real world without somehow filtering and simplifying those
complexities. That act of filtering and simplifying affects
what the observer sees because it necessarily brings
into play the observer's past experiences of the world.
In the final analysis, this position means that we are always
dealing with perceptions, not "facts" in some absolute
sense. As Petrie (1972:49) put it, "the very categories
of things which comprise the 'facts' are theory
dependent" or, in our terms, paradigm dependent. It is
this recognition that the scientist inevitably operates
within the constraints of a perception-based paradigm
(with ideological and political underpinnings) that leads
Howard Becker (1970:15) to argue that "the question is not
whether we should take sides, since we inevitably will,
but rather whose side we are on."

It is also in this context that the notion of subjectivity,
properly construed, can become a positive rather
than a pejorative term in evaluation research. Subjectivity
in the alternative paradigm "allows the researcher
to 'get close to the data,' thereby developing the
analytical, conceptual, and categorical components from
the data itself -- rather than from the preconceived, rigidly
structured, and highly quantified techniques that
pigeonhole the empirical social world into the operational
definitions that the researcher has constructed"
(Filstead, 1970:6). Moreover, a positive view of subjectivity
-- getting close to and involved with the data --
makes it possible for evaluation researchers to take into
account their personal insights and behavior. As Scriven
(1972:99) laments, "For the social sciences to refuse to
treat their own behavior as data from which one can learn
is really tragic." Alvin Gouldner (1970) is even more
adamant on this point. He suggests that "high science
methodology" creates a gap between what the researcher as
scientist deals with and what that same researcher (like
others) confronts as an ordinary person, experiencing his
or her existence:

It is a function of high science methodologies to
widen the gap between what the sociologist is studying
and his own personal reality. Even if one were
to assume that this serves to fortify objectivity and
reduce bias, it seems likely that it has been bought
at the price of the dimming of the sociologist's
self-awareness. In other words, it seems that, at some
point, the formula is: the more rigorous the methodology,
the more dimwitted the sociologist; the more
reliable his information about the social world, the
less insightful his knowledge about himself (p. 56).


To say that the evaluation researcher can learn much
by getting close to the data is not to say that there is
no systematic way of conducting scientific inquiry, that
anything goes. The point, rather, is to bring the mind
and feelings of the human being back into the center of
evaluation research -- a center that has thus far been dominated
by techniques and rules. It is to recognize that
science is really nothing if it is not the application of
critical intelligence to critical problems. The narrow
parameters of the dominant paradigm have constrained that
critical intelligence under the guise of attaining a natural
science objectivity. In this regard, C. Wright
Mills (1961:58) quotes Nobel Prize-winning physicist
Percy Bridgman to the effect that "there is no scientific
method as such, but the vital feature of the scientist's
procedure has been merely to do his utmost with his mind,
no holds barred."

The verstehen or understanding approach to scientific
inquiry is based on the application of critical intelligence
to social phenomena without mediation by preconceived
categories and without the abstraction of numerical
representation. This alternative paradigm seeks
to redraw the boundaries of legitimate scientific inquiry,
thereby increasing the domain of what has been labeled
(qualitatively) subjective by the dominant paradigm so
that many of what have been thought of as illegitimate
practices and topics can be tackled.

Space does not permit a full epistemological exploration
of the arguments underlying traditional notions
of objectivity and subjectivity in evaluation research.
It may be helpful, however, to again use the problem of
evaluating innovations in open education to illustrate
the different perspectives on objectivity and subjectivity
represented by the two evaluation methodology paradigms.
The dominant paradigm lauds the use of standardized
tests to measure pupil achievement in school because
these tests are highly reliable, their outcomes have been
widely replicated on varying populations, and their statistical
properties are well-known. In brief, standardized
tests represent an objective measure of achievement
across situations and populations. Standardized tests
properly administered minimize the introduction of
researcher bias in measuring achievement.

However, standardized tests can bias evaluation results
by imposing a standardized and controlled stimulus
in an environment where learning depends on spontaneity,
creativity, and freedom of expression, as Shapiro (1973)
found in her study of innovative Follow Through classrooms
described earlier. Moreover, she found that the results
of the test measured response to a stimulus (the
test) which was essentially alien to the experience of
the children. Because the open classroom relies substantially
less on paper-and-pencil skills and because student
progress is monitored on a personal basis without
the use of written examinations, student outcomes in the
open classroom could not be "objectively" measured by
standardized tests. Such tests fail to delineate the learning
outcomes of children who make differential uses of particular
classroom situations. Shapiro argues that "the quest
for objective control over the multiplicity of interdependent
events occurring in a classroom has led to a concentration
on ever smaller units of behavior, divorced from
context and sampled in rigorously scheduled time units"
(p. 543).

The actual behaviors of children observed in the open
classroom situation were not validly captured by standardized
tests or one-to-one interviews with adults, even when
the interviewer was someone who was familiar to the children.
For the children in open classrooms, "the transition
from the relatively free and easy exchange of the
classroom to the more constricted interview was not automatic;
it was, in fact, not possible" (p. 539). Dell Hymes
(1971:56) describes this kind of situation in more technical
language: "When a child from one developmental matrix
enters a situation in which the communicative expectations
are defined in terms of another, misperception and misanalysis
may occur at every level. . . ; intents and innate abilities
may be misevaluated because of differences of systems
for the use of language and for the import of its use
(as against other modalities)."

The problem is not simply one of finding a new or
better standardized test. The problem is one of understanding
the context of observed behaviors, the meaning of specific
achievement outcomes to the child in a more holistic
setting than is possible with any standardized test. This
does not mean that standardized tests may not be useful
for certain specific questions, but they are not sufficient
when the issue is understanding, not just prediction.
Understanding in its broadest sense requires getting
close enough to the situation to gain insight into
mental states; it means subjectivity in the best scientific
sense of the term. The alternative paradigm seeks to
legitimize and incorporate this subjectivity into evaluation
research, not to the exclusion of the methodology of
the dominant paradigm, but in addition to it.

If a limited notion of subjectivity based on careful
and systematic observation by trained researchers in
the best tradition of anthropological research cannot be
made a legitimate part of evaluation research, then a
host of crucial questions will be excluded from investigation.
"If we cannot straighten out the situation,"
Scriven (1972:97) warns, "we are doomed to suffer from the
swing of the pendulum in the other direction, a swing
which it is easy to see implicit in the turn toward
irrationalistic, mystical, and emotional movements thriving
in or on the fringes of psychology today. There is much
good in them on their own merits, but the ideology that
is used to support them is likely to breed the same intolerance
and repression that the positivists spread
through epistemology and psychology for a quarter century."

Distance from vs. Closeness to the Data 



There are several additional paradigm components that have
emerged in the discussion of quantitative versus qualitative
methodology, reliability versus validity, and objectivity
versus subjectivity that deserve additional comment.
One of these involves the issue of how close the
investigator should get to the data. The dominant paradigm
prescribes distance to guarantee neutrality and objectivity.
This component of the dominant paradigm has
become increasingly important with the professionalization
of the social sciences and educational research establishment.
Professional comportment connotes cool, calm, and
detached analysis without personal involvement. The profession
is identified by and takes pride in its skills --
in this case quantitative methodology and empiricism --
not in its ability to serve the needs of "clients" (cf.
Horowitz, 1964:10-11).

Alvin Gouldner (1970:53) suggests that this emphasis
on detachment and professional distance is the social
scientist's way of accommodating himself to his alienation
in contemporary society, a reaction to "man's failure
to possess the social world that he created." This
alienation is built on the notion that society and culture
can be viewed like any other "natural" phenomena, as
having laws that operate quite apart from the intentions,
motivations, and plans of human beings. Methodology follows
this assumption by emphasizing prediction and universal
laws rather than understanding and human meaning.
Horowitz (1965:11) is less kind, emphasizing the elitism
and arrogance of social scientists as they disguise their
search for status and professional prestige behind a thin
veil of neutrality and detachment.

Whatever the source of the emphasis on distance and
detachment in the dominant paradigm, its centrality to
that methodology can scarcely be questioned. What is
questioned by the alternative paradigm is the necessity
of distance and detachment. The alternative paradigm assumes
that without empathy and sympathetic introspection
derived from personal encounters the observer cannot fully
understand human behavior. Understanding comes from
trying to put oneself in the other person's shoes, from
trying to discern how others think, act, and feel. John
Lofland (1971) explains that methodologically this means
1) getting close to the people being studied through attention
to the minutiae of daily life, through physical
proximity over a period of time, and through development
of closeness in the social sense of intimacy and
confidentiality; 2) being truthful and factual about
what is observed; 3) emphasizing a significant amount of
pure description of action, people, activities, etc.; and
4) including as data direct quotations from participants
as they speak and/or from whatever they might write. "The
commitment to get close, to be factual, descriptive, and
quotive, constitutes a significant commitment to represent
the participants in their own terms" (p. 4).

The commitment to closeness is further based upon
the assumption that the inner states of people are important
and knowable. It is at this point that the alternative
paradigm intersects with the phenomenological tradition
(cf. Bussis, Chittenden, and Amarel, 1973). Attention
to inner perspectives does not mean administering attitude
surveys. "The inner perspective assumes that understanding
can only be achieved by actively participating
in the life of the observed and gaining insight by
means of introspection" (Bruyn, 1963:226).

A commitment to get close to the data and a willingness
to capture participants in their own terms imply
an openness to the phenomenon under study that is relatively
uncontaminated by preconceived notions and categories.
"In order to capture the participants 'in their
own terms' one must learn their analytic ordering of the
world, their categories for rendering explicable and coherent
the flux of raw reality. That, indeed, is the first
principle of qualitative analysis" (Lofland, 1971:7).

In the Shapiro study of open Follow Through classrooms,
it was her closeness to the classrooms under study
and the children in those classrooms that allowed her to
see that something was happening that was not captured by
standardized tests. She could see differences in children.
She could understand differences in the meaning of
their different situations. She could feel their tension
in the testing situation and their spontaneity in
the more natural classroom setting. Had she worked solely
with data collected by others, had she worked only at
a distance, she would never have discovered the crucial
differences in the classroom settings she studied --
differences in modes of achievement which actually allowed
her to evaluate the innovative program in a meaningful
and relevant way.

Again, it is important to note that the admonition
to get close to the data is in no way meant to deny the
usefulness of quantitative methodology. Rather, it is to
say that statistical portrayals must always be interpreted
and given human meaning. That many quantitative methodologists
fail to ground their findings in qualitative
understanding poses what Lofland calls a major contradiction
between their public insistence on the adequacy of
statistical portrayals of other humans and their personal
everyday dealings with and judgments about other human
beings:

In everyday life, statistical sociologists, like
everyone else, assume that they do not know or understand
very well people they do not see or associate
with very much. They assume that knowing and
understanding other people require that one see them
reasonably often and in a variety of situations relative
to a variety of issues. Moreover, statistical
sociologists, like other people, assume that in order
to know or understand others one is well advised to
give some conscious attention to that effort in
face-to-face contacts. They assume, too, that the internal
world of sociology -- or any other social world --
is not understandable unless one has been part of
it in a face-to-face fashion for quite a period of
time. How utterly paradoxical, then, for these same
persons to turn around and make, by implication, precisely
the opposite claim about people they have
never encountered face-to-face -- those people appearing
as numbers in their tables and as correlations
in their matrices! (Lofland, 1971:3)

This returns us to the recurrent theme of matching
the evaluation methodology to the problem. The highly
informal, personalized environment of open education
obviously lends itself to a more personalized evaluation
methodology built upon close observer-student and
observer-teacher interaction. Such a personalized
evaluation is important not only for the insights it can
generate but because a personalized evaluation that takes
the observer close to the data is the only evaluation
research likely to be perceived as legitimate by program
participants themselves. To the extent that judging the
quality of evaluation research includes judging its
legitimacy and usefulness to program participants--and we
would argue that this criterion should be central--then
the matching of evaluation methodology to the nature of
the program being evaluated is also central.

Finally, in thinking about the issue of closeness to 
the data, it is useful to remember that many major
contributions to our understanding of the world have come
from scientists' personal experiences. One finds many
instances where closeness to the data made possible key
insights--Piaget's closeness to his own children, Freud's
proximity to and empathy with his patients, Darwin's
closeness to nature, even Newton's intimate encounter with 
an apple. The distance prescribed by the dominant para- 
digm makes such insights derived from personal experi- 
ence an endangered species. 
Holistic vs. Component Analysis



Nowhere is the need to match methodology and problem more
evident than in the dichotomy represented by holistic
versus component analysis. Component analysis achieves its
highest expression in classical Fisherian experiments
using factorial designs, the most highly lauded of all
evaluation designs (cf. Rossi, 1972:46). Experimental
designs by their nature usually focus on some narrowly
defined set of variables, at least one of which is the
"treatment." Causes are separated from effects, and both
cause and effect variables have to be carefully delimited
and operationally defined.

Treatments in educational research are usually some
type of new hardware, a specific curriculum innovation,
variations in class size, or some specific type of
teaching style. One of the major problems in experimental
educational research is clear specification of what the
treatment actually is, which implies controlling all other
possible causal variables and the corresponding problem
of multiple treatment interference and interaction effects.
It is the constraints posed by controlling the specific
treatment under study that necessitate simplifying and
breaking down the totality of reality into small component
parts. A great deal of the scientific enterprise revolves
around this process of simplifying the complexity of
reality. While this process is inevitable, it is also
distorting. And it is the narrowness of focus in most
experiments, with all their artificial controls and
isolated treatments, that leads to the preponderance of
"so what?" results, even on those rare occasions when
significant differences in treatments are uncovered. The
additional question of the relevance of laboratory
experiments for field settings only increases the distance
between what is evaluated in most experiments and what
actually happens in most classrooms or social action
programs. Despite the dismal, disappointing, largely
meaningless, and irrelevant (from the point of view of
practitioners) results of thousands of educational
experiments and quasi-experiments, the spokesmen for the
dominant paradigm still argue that such designs are the
"only available route to cumulative progress" (Campbell
and Stanley, 1966:3).

Clearly there are questions of major import that do 
not lend themselves to experimental design or even less 
rigorous quantitative methodologies that focus on a limi- 
ted number of narrowly defined variables. The simplified 
world of variables, causes, and effects in which the
scientists of the dominant paradigm operate is alien to
most teachers and change agents. Evaluations that are
relevant and meaningful to the total context in which
innovations occur need to include a holistic
methodological approach built on the functioning,
day-to-day world of program participants.

A holistic evaluation methodology is particularly
crucial for holistic program innovations--like open
education. Open education is an all-encompassing
innovation. It involves not only changes in curriculum,
materials, and methods, but also changed social
relationships that affect the entire structure of the
child's learning environment. Open education means new
roles for teachers and learners, changed status
arrangements in the classroom, a new set of norms, new
expectations, and different criteria for evaluation.
Interactions among students and the relationships between
students and teachers are changed. Under conditions of
such all-encompassing innovation it is impossible to
specify what the treatment is. Moreover, it is impossible
to carefully isolate and control component parts of open
classrooms because the parts are so interdependent and
interacting (cf. Patton, 1973).

To evaluate the meaning of open education as a
holistic phenomenon requires a methodology that gets close
to the classroom experience of children, a methodology of
participant observation, in-depth interviewing, and
careful descriptive detail that is subjective in the sense
I specified earlier--the sense of discovering the meaning
of the classroom experience from the point of view of the
children and teachers.

A holistic evaluation methodology attempts to
transcend the artificial conflicts in modern schools
described by John Dewey in The Child and the Curriculum.
"We get the case of the child vs. the curriculum; of the
individual nature vs. social culture. Below all other
divisions in pedagogic opinion lies this opposition"
(Dewey, 1956a:5). A major component of this artificial
conflict, for Dewey, was the division and specialization
of subject matter in the curriculum. Academic divisions,
he argued, are alien to the nature of the child:

Again, the child's life is an integral, a total one.
He passes quickly and readily from one topic to
another, as from one spot to another, but is not
conscious of transition or break. There is no conscious
isolation, hardly conscious distinction. The things
that occupy him are held together by the unity of the
personal and social interests which his life carries
along. (His) universe is fluid and fluent; its
contents dissolve and re-form with amazing rapidity. But
after all, it is the child's own world. It has the
unity and completeness of his own life (pp. 5-6).

In contrast to the wholeness of the child's
perceptions and experiences, "he goes to school, and
various studies divide and fractionalize the world for
him" (p. 6). Dewey argued that in contrast to the school's
methods of specialization and division, "the only
significant method is the method of the mind as it reaches
out and assimilates. . . . It is because of this
(specialization) that 'study' has become a synonym for
what is irksome, and a lesson identical with a task"
(p. 9):

Abandon the notion of subject-matter as something
fixed and ready-made in itself, outside the child's
experience; cease thinking of the child's experience
as something hard and fast; see it as something
fluent, embryonic, vital; and we realize that the
child and the curriculum are simply two limits
which define a single process (p. 11).

Despite the totality of our personal experiences as
living, working human beings, we have focused in
evaluation research on parts, not only instead of wholes,
but to the virtual exclusion of wholes. "We knew that
human behavior was rarely if ever directly influenced or
explained by an isolated variable; we knew that it was
impossible to assume that any set of such variables was
additive (with or without weighting); we knew that the
complex mathematics of the interaction among any set of
variables, much less their interaction with external
variables, was incomprehensible to us. In effect, although
we knew they did not exist, we defined them into being"
(Deutscher, 1970:33).

While the radical critique of component analysis
made by Deutscher in the last paragraph will be considered
unacceptably extreme by most scientists, I find that
teachers and practitioners voice the same criticisms
about the bulk of evaluation research. Narrow experimental
results lack relevance for innovative teachers because
they have to deal with the whole in their classrooms. The
reaction of these teachers to scientific research is like
the reaction of Copernicus to the astronomers of his day:
"With them," he observed, "it is as though an artist were
to gather the hands, feet, head, and other members for his
images from diverse models, each part excellently drawn,
but not related to a single body, and since they in no way
match each other, the result would be monster rather than
man" (cf. Kuhn, 1970:83). What teacher has not complained
of the educational evaluation monster?

It is no simple task to undertake holistic
evaluation, to search for the Gestalt in innovative
classrooms and program innovations. The challenge for the
participant observer is "to seek the essence of the life
of the observed, to sum up, to find a central unifying
principle" (Bruyn, 1970:316).

Again the work of Shapiro in evaluating innovative
Follow Through classrooms is instructive. She found that
test results could not be interpreted without
understanding the larger cultural and institutional
context in which the individual child is situated:



The relevance and appropriateness of the classroom
and the test situation as locations for studying
the impact of schooling on children requires
reevaluation. Each can supply useful information, but
in both situations the evidence is situation-bound.
Neither yields pure measures, and it is necessary to
consider the type of school situation the children
are in and their developmental status, as well as
the social and sociological factors that determine
or have determined the children's expectations,
perceptions, and styles of thinking and communication
with other children and adults. What may be an
appropriate situation for assessing some groups may
lead to misevaluation of others. The standard test,
given under optimal conditions, may offer moderately
valid estimates of competence for middle-class
children (though every psychologist is aware of at least
a few cases of gross misevaluation). Its adequacy
and appropriateness may depend on unspecified
built-in lines of continuity between middle-class cultural
expectations and the demands of the test situation,
rather than on intrinsic characteristics of the test
itself. For lower-class children of different
backgrounds there may be no comparable set of connectives,
or the test situation may call for a type of response
which is not valued in the child's cultural milieu.
It is an old chestnut that psychological dimensions
cannot be defined in terms of their physical
equivalence; psychologists who are trying to study the
impact of different kinds of experience on different
kinds of children must be able to shift their
expectations and tools depending on the contexts in which
they are working. (Shapiro, 1973:541.)

Neither the holistic approach nor component
analysis represents an omnibus strategy appropriate to all
situations and problems. But in reaction to the dominance
of component analysis as The Scientific Method in
evaluation research this paper has emphasized the potential for
more holistic evaluation strategies for holistic program 
innovations. 
Process vs. Outcome Evaluation



The dominant scientific paradigm in evaluation research
is preoccupied with outcomes. As with component analysis,
the highest expression of this preoccupation is found in
experimental designs. There is a pre-test, a treatment,
and a post-test. The scientific observer enters the
picture at two points in time, pre-test and post-test, and
compares the treatment group to the control group on
post-test measures. As already noted, such designs assume
a single, identifiable, isolated, and measurable
treatment. What's more, such designs assume that once
introduced, the treatment remains relatively constant
and unchanging.
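
The logic of such an outcome design can be sketched in a
few lines. The sketch below is only an illustration under
stated assumptions: the scores are invented, and the names
outcome_difference, treatment_post, and control_post are
hypothetical, not taken from any study cited here.

```python
from statistics import mean


def outcome_difference(treatment_post, control_post):
    """Return mean post-test score of the treatment group minus the control group."""
    return mean(treatment_post) - mean(control_post)


# Hypothetical post-test scores, invented for illustration only.
treatment_post = [72, 68, 75, 70]
control_post = [66, 69, 64, 67]

# The whole evaluation reduces to one number computed at the endpoint.
difference = outcome_difference(treatment_post, control_post)
print(difference)
```

Note what the calculation never sees: nothing that happened
between the two measurement points enters it, so any change
in the treatment itself is invisible to the comparison.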

While there are some narrow educational treatments 
that fit this description, more encompassing program in- 
novations in practice are anything but static treatments. 
Frequently, by the time innovations are put into practice, 
they are already different than they appear in program pro- 
posals. Once in operation, innovative programs are fre- 
quently changed as practitioners learn what works and what 
doesn't, as they experiment and grow and change their pri- 
orities. 

All of this, of course, provokes nearly unlimited 
frustration and hostility from scientific evaluators who 
need specifiable, unchanging treatments to relate to spe- 
cifiable, pre-determined outcomes. Because of a commit- 
ment to a single evaluation paradigm evaluators are fre- 
quently prepared to actually do everything in their power 
to stop program adaptation and improvement so as not to 
interfere with their research design (cf. Parlett and 
Hamilton, 1972:6). The deleterious effect this may have 
on the program itself by discouraging new developments 
and redefinitions in mid-stream is considered a small
sacrifice to be made in pursuit of higher level scientific
knowledge. The arrogance and insensitivity of evaluators
at such times--which are considerably more frequent than
one might suspect--are all the more inexcusable when one
considers that such interventions probably have already
contaminated the treatment by affecting staff morale and
participant autonomy.

Were some science of planning and policy/program de- 
velopment so highly developed that initial proposals were 
perfect, one might be able to sympathize with the desire of 
evaluators to keep the initial program implementation
intact. In the real world, however, people and unforeseen
circumstances shape programs, and initial implementations
must be modified in ways that are rarely trivial. Nor is
it the task of program administrators and participants to
shape their program to the needs of evaluators. Rather,
the task of evaluators is to shape their evaluation
methodologies to fit programs.

Under field conditions where programs are subject to
change and redirection, the alternative evaluation
paradigm replaces the outcome emphasis of the dominant
paradigm with a process orientation. Process evaluation is
not tied to a single treatment and pre-determined goals or
outcomes. Process evaluation focuses on the actual
operations of a program over a period of time. The
evaluator sets out to understand and document the
day-to-day reality of the setting or settings under study.
Like the anthropologist, the process evaluator makes no
attempt to manipulate, control, or eliminate situational
variables or program developments, but takes as given the
complexity of a changing reality. The evaluator tries to
unravel what actually happens; he or she never takes for
granted the implementation of a proposed treatment or
innovation. The data of the evaluation are not just
outcomes, but changes in treatments, patterns of action,
reaction, and interaction. Under some conditions the
initial and on-going observations of the evaluator can even
serve as a source of program improvement--an impossibility
under most controlled, static experimental designs.

In short, process evaluation requires sensitivity
to both qualitative and quantitative changes in programs
throughout their development, not just at some endpoint
in time; it is built on subjective inferences in the sense
that the investigator attempts to develop empathy with
program participants and understand the changing meaning
of the program in the participants' own terms; it requires
getting close to the data, becoming intimately acquainted
with the details of the program; it includes a holistic
orientation to evaluation research, looking at not only
anticipated outcomes but unanticipated consequences,
treatment changes, and the larger context of program
implementation and development.
Uniqueness vs. Generalization



The thrust of the dominant paradigm in evaluation
research is a concern with discovery of scientific laws
and theories. The Scientific Method is applied to uncover
patterns of behavior; the ideal is to so specify and
quantify factors of social causation that the research
scientist can explain 100 percent of the variance in
social phenomena. The scientist in this instance seldom
considers what a dismal world it would be if we could
indeed account for 100 percent of the variance in human
behavior.

The dominant paradigm is directed at producing
generalizations. The assumption that this is the goal of
Science is so deeply ingrained that it is virtually true
by definition. I have never seen this assumption
questioned in the literature on Scientific Methodology.
Science is the search for generalizations.

Yet as human beings we place immense value on our 
individuality. Philosophers suggest that the greatest 
contribution of Western culture and civilization is the
value it places on the individual. The rhetoric of
educational innovation and social action programming is
replete with references to reaching and serving individual
clients. It strikes me that this emphasis on the
individual has important implications for humanistic
evaluation research.

Evaluation research studies in the tradition of the
dominant paradigm report virtually nothing but norms,
standards, surveys, and prediction equations. "But this
very interest perhaps unduly distracts attention from the
degree to which education is idiosyncratic as well as
nomothetic. Teachers rarely feel they are facing merely 3
to 300 incarnations of points on a distribution; they
hope they are educating Johnny Johnson and Suzy Smith.
But, by those espousing the narrow definition (of
Science, i.e. the dominant paradigm), dealing with the
individual is usually considered an affair of art (medicine
curing this patient) or technology (engineering building
this bridge); the whole conceptual apparatus of science,
along with its counterparts in educational philosophy and
educational research, is often seen as inapplicable"
(Dunkel, 1972:80).

In technical terms educational researchers sometimes
recognize individuality when they discuss "disordinal
interactions," i.e. treatments interacting with
personological variables in educational experiments. This
simply means that there may be some innovations that work
better for certain types of students rather than showing
across-the-board effects. Both Cronbach (1966) and Kagan
(1966) have expressed the belief that the discovery method
works better for some students than for others; some
students will perform better with inductive teaching, and
some will respond better to didactic teaching. Stolurow
(1965) also has suggested that learning strategies
interact with personological or individual variables.

Though such suggestions are hardly news to teachers
(they know that different kids learn in different ways,
though they don't always know how to take those
differences into account in their teaching), disordinal
interactions have rarely been uncovered in experimental
research. Bracht and Glass (1968:449) report that while
there are convincing arguments as to why one should
expect disordinal interactions, "the empirical evidence for
disordinal interactions is far less convincing than the
arguments." In point of fact, the actual search for
disordinal interaction is rare--most researchers don't
bother with the difficult statistical analyses necessary
or don't measure relevant variables--and "the molarity
(as opposed to the molecularity) of both personological
variables and the treatments incorporated into many
experiments may tend to obscure disordinal interactions
which might be observable when both the variables and the
treatments are more narrowly defined" (Bracht and Glass,
1968:451). Bracht and Glass (1968:452) conclude that
"searching for such interactions with treatments as
necessarily complex as instructional curricula may be
fruitless."
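
A small sketch may make the term concrete. Assuming
invented scores and hypothetical labels (inductive,
didactic, type_a, type_b), it shows a disordinal, or
crossover, interaction: each teaching method is better for
one type of student, while the across-the-board treatment
means are identical.

```python
from statistics import mean

# Hypothetical scores, invented for illustration only.
# Keys are (teaching method, student type).
scores = {
    ("inductive", "type_a"): [80, 84],
    ("inductive", "type_b"): [60, 64],
    ("didactic", "type_a"): [62, 66],
    ("didactic", "type_b"): [78, 82],
}


def cell_mean(treatment, student_type):
    """Mean score for one treatment within one student type."""
    return mean(scores[(treatment, student_type)])


def treatment_mean(treatment):
    """Across-the-board mean for a treatment, pooling all student types."""
    pooled = [s for (t, _), vals in scores.items() if t == treatment for s in vals]
    return mean(pooled)


def is_disordinal(type_1, type_2):
    """True when the better treatment differs between the two student types."""
    diff_1 = cell_mean("inductive", type_1) - cell_mean("didactic", type_1)
    diff_2 = cell_mean("inductive", type_2) - cell_mean("didactic", type_2)
    return diff_1 * diff_2 < 0  # opposite signs mean the effects cross


print(treatment_mean("inductive"), treatment_mean("didactic"))
print(is_disordinal("type_a", "type_b"))
```

In this constructed case a comparison of overall treatment
means would report no difference, although the choice of
method matters for every student type.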

In effect, Bracht and Glass prefer to dismiss the 
question rather than call into question the methodology 
that fails to find and predict individual differences. 
But for teachers, particularly teachers in innovative
programs of open, informal, and humanistic education, the
question will not go away. Indeed, for these teachers the 
central issue in the educational process is how to identi- 
fy and deal with individual differences in children. Any 
serious and prudent observer knows that such differences 
exist, but experimental designs consistently fail to un- 
cover them. Is it any wonder that practitioners find so 
much of educational evaluation useless and irrelevant?

Where the emphasis is on individualization of teaching
or meeting the needs of individual welfare
recipients--the 'clients' in social action programs--an
evaluation strategy is needed that can take the individual
into account. An evaluation methodology that takes the
individual into account must be sensitive to uniqueness in
both people and programs as well as similarities among
people and generalizations about treatments. This is not a
call for psychological reductionism, but rather an
expression of what C. Wright Mills (1961) called "the
sociological imagination"--a focus on the intersection of
biography and history; attention to the interaction of the
individual and social structure.

The alternative paradigm of evaluation research can
take account of the individual through its commitment to
get close to the data, to be factual, descriptive, and
quotive, i.e. to represent participants in their own
terms. Lofland (1971:4), in describing such a humanistic
approach to scientific research, argues that:

...this does not mean that one becomes an apologist
for them, but rather that one faithfully depicts
what goes on in their lives and what life is like for
them, in such a way that one's audience is at least
partially able to project themselves into the point
of view of the people depicted.

They can 'take the role of the other' because the
reporter has given them a living sense of day-to-day
talk, day-to-day activities, day-to-day concerns and
problems. The audience can know the petty vexations
of their existence, the disappointments that befall
them, the joys and triumphs they savor, the typical
contingencies they face. There is a conveyance of
their prides, their shames, their secrets, their
fellowships, their boredoms, their happinesses, their
despairs. . . . It is the observer's task to find out
what is fundamental or central to the people or
world under observation.

One of the effects of the overriding concern with
finding generalizations in the dominant paradigm has been
emphasis on ever larger samples, inclusion of an
ever-increasing number of cases in research studies, and
the concomitant ever greater distance from and
quantification of the data. The case study has fallen into
disrepute in social science. Yet for certain types of
questions, case studies in evaluation research are still
very much needed. When the evaluation is aimed at
improvement of a specific program, or when the information
collected is for participants and not just scientists, and
the concern is for individuals not just broad
generalizations, then a case study approach that
identifies uniqueness and idiosyncracies can be
invaluable. Case studies can and do accumulate.
Anthropologists have built up an invaluable wealth of case
study data that includes both idiosyncratic information
and patterns of culture. There is every reason to believe
that the young discipline of evaluation research would be
well served by a similar approach.
More important is the likelihood that an in-depth case 
study would better serve program administrators and par- 
ticipants than the large-scale comparative studies aimed 
at finding similarities across program treatments. Not 
the least benefit of using the alternative paradigm is 
that the results are readily understandable to program 
participants and that their alienation from science and 
scientists is likely to be diminished--a humanistic con- 
sideration that has received little more than lip-service 
in most evaluation research. 
Evaluation for Whom and for What?






The unanswered question underlying all of our discussion 
is for whom and to what end evaluative research is under- 
taken. It is a platitude in the evaluation literature 
that evaluative research should serve both scientists and 
practitioners. In reality, the needs of these two groups 
are frequently quite different. The dominant paradigm
serves to delineate accepted and acceptable scientific
practice. In terms of career considerations, personal
legitimacy, and professional commitments, social
scientists and educational researchers are best able to
meet their needs by adherence to the prescriptions and
standards of the dominant paradigm. The nature of funding
in most major evaluative research reinforces this emphasis
by rewarding grandiose designs, elegant sampling, and
sophisticated quantitative methodological procedures. Such
evaluations--frequently national in scope--focus on
outcomes assessment and summative evaluation. Such evalua-



Quite a different strategy is required where
evaluation is aimed at serving and informing teachers and
program practitioners about progress and functioning,
areas of competence and confusion, attitudes, feelings,
and practices which may be related to maximizing what the
school or program has to offer. Evaluations that are to
be useful to specific practitioners must be focused at
the local level. They must include description and
analysis of local settings. They must take account of what
happens in programs on a day-to-day basis. We
particularly need to be able to describe context,
treatment, and outcomes in ways that are understandable,
meaningful, and relevant to practitioners. The major value
of this kind of program evaluation at this local level is
its contribution to program development, not its labeling
of successes and failures. The possibility for meaningful and
useful feedback can occur only if evaluation research is 
tied to specific programs. It is also only at the local 
level that the decision of when to measure program im- 
pact can be made. National schedules for impact assess- 
ment almost invariably ignore variations in nature and de- 
gree of real program implementation. 

While it is at the local level of immediate program 
evaluation that the alternative paradigm is most useful, 
this does not mean that it serves practitioners at the 
expense of generating scientific knowledge of interest to
the larger community. At the present stage of development
of an interdisciplinary approach to evaluation research,
with so little known about what constitutes a treatment or
outcome and how evaluators can best measure these
artifacts of social intervention, the alternative
paradigm holds forth the promise of an accumulation of
rich documentation that can serve well the larger goals
of the scientific community.
Conclusion



I have outlined two paradigms of evaluation research. To 
facilitate analysis and discussion I have looked at these 
paradigms through a set of dichotomies: qualitative vs. 
quantitative methodology, validity vs. reliability, sub- 
jectivity vs. objectivity, closeness to vs. distance from 
the data, holistic vs. component analysis, process vs.
outcome evaluation, and research for practitioners vs.
research for scientists. In reality these are not
dichotomies but continua along which evaluations and scientists
vary. 

As ideal-types, however, these dichotomies allow a
kind of dialectical approach to consideration of the
problem of competing paradigms. Though I have suggested
only vaguely some possibilities for synthesis, my purpose
has not been to undermine the dominant paradigm, but
rather to plead for legitimacy for the alternative
paradigm. Most important, I have argued that the
evaluation strategy must be matched to the nature and
needs of the evaluation problem and program setting.

Neither paradigm can meet all evaluation needs. The 
two paradigms have different strengths and weaknesses. It 
is my position that the strengths of the dominant paradigm 
do not justify its overwhelming monopolization of
evaluative research and that the weaknesses of the alternative
paradigm do not justify its current subordination. 

Yet, as in any paradigm debate, great passions are 
aroused by advocates on each side. Kuhn (1970:109-110) 
tells us that this is the nature of paradigm debates:
"To the extent that two scientific schools disagree about
what is a problem and what a solution, they will
inevitably talk through each other when debating the
relative merits of their respective paradigms. In the
partially circular arguments that regularly result, each
paradigm will be shown to satisfy more or less the
criteria that it dictates for itself and to fall short of
a few of those dictated by its opponent. . . . Since no
paradigm ever solves all problems it defines and since no
two paradigms leave all the same problems unsolved,
paradigm questions always involve the question: Which
problems is it more significant to have solved?"
Evaluation Series:

Observation and Description: An Alternative Methodology
for the Investigation of Human Phenomena
Patricia F. Carini

A Handbook on Documentation
Brenda Engel

An Open Education Perspective on Evaluation
George E. Hein

Deepening the Questions About Change: Developing the
Open Corridor Advisory
Lillian Weber

The Teacher Curriculum Work Center: A Descriptive Study
Sharon Feiman

Single copies $2, from Vito Perrone, CTL
U. of North Dakota, Grand Forks, N.D. 58201